
Conversation

@jan-service-account

Updates dev branch with latest release (b6834) from ggml-org/llama.cpp

compilade and others added 5 commits October 23, 2025 16:31
* convert : begin handling pre-quantized models

* convert : fix conversion from FP8 for Deepseek-V3.1-Base
…gml-org#16751)

This commit adds the trust_remote_code=True argument when loading models
with AutoConfig, AutoTokenizer, and AutoModelForCausalLM in the
run-original-model script.

The motivation for this is that some models require custom code to load
properly, and setting trust_remote_code=True avoids an interactive prompt
asking for user confirmation:
```console
(venv) $ make causal-run-original-model
The repository /path/to/model contains custom code which must be
executed to correctly load the model. You can inspect the repository
content at /path/to/model.

Do you wish to run the custom code? [y/N] N
```

Having this as the default seems like a safe choice: we have to clone or
download the models we convert anyway, so we would already expect to run
any custom code they ship.
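For illustration, a minimal sketch of the loading pattern this commit applies (not the actual script; the model path is a placeholder, and this assumes the transformers library plus a local model checkout):

```python
# Hedged sketch: passing trust_remote_code=True to the Auto* loaders
# suppresses the interactive "[y/N]" confirmation for repositories
# that ship custom modeling code.
from transformers import AutoConfig, AutoTokenizer, AutoModelForCausalLM

model_path = "/path/to/model"  # placeholder, as in the prompt shown above

config = AutoConfig.from_pretrained(model_path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
```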
* webui: support q URL parameter

Fixes ggml-org#16722
I’ve checked that it works with Firefox’s AI tools

* webui: apply suggestions from code review

Co-authored-by: Aleksander Grygier <[email protected]>

* chore: update webui static build

---------

Co-authored-by: Aleksander Grygier <[email protected]>
…st (ggml-org#16742)

* Fix CUDA grid launch condition for large block_nums.y

* add backend ops test

* reduce test repetitions
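As background for the grid-launch fix above: CUDA limits gridDim.y (and gridDim.z) to 65535 blocks, so a kernel that maps work onto block_nums.y must check that limit and take a fallback path for larger workloads. A minimal sketch of that condition (hypothetical helper, not llama.cpp's actual code):

```python
# CUDA hardware limit on gridDim.y and gridDim.z (gridDim.x allows 2^31 - 1).
CUDA_MAX_GRID_DIM_Y = 65535


def can_launch_single_grid(block_nums_y: int) -> bool:
    """True if the workload fits a single kernel launch along grid y;
    otherwise the launch must split or loop over the y dimension."""
    return block_nums_y <= CUDA_MAX_GRID_DIM_Y
```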
@jan-service-account jan-service-account merged commit d9811d9 into dev Oct 25, 2025
3 checks passed
@jan-service-account jan-service-account deleted the update-dev-from-master-2025-10-25-00-33 branch October 25, 2025 00:35
